\chapter{Testing and Evaluations}
\label{chap:07TestingEval}
Testing is a crucial part of every software project. In this chapter, we will look at tests used during development of Visual Profiler. Next, we will evaluate performance overhead and correctness of the profiler.

\section{Testing}
During the development we employed various testing strategies based on the nature of a particular project. 

To test the profiler on a more realistic application we used a multithreaded application rendering Mandelbrot-set (a fractal). It was developed as a class project in 2008. The application and its source code are available without any major changes on the attached DVD. 

\section{Testing of Visual Profiler Backend}
The backend part is written in C++. Unfortunately, our skills and experience in the C++ platform did not allow us to write unit tests. Therefore the testing of functionality was accomplished manually. 

A set of .NET simple testing applications were created to monitor  the profiler's behavior. The backend always transformed output call trees into a textual representation that allowed us to verify the expected values, such as call orders, methods hit counts, metadata collections and so on. We also focused on stability of the profiler in cases of exceptions.

Invaluable help was the possibility to debug the profiler backend. Thanks to this we could analyze many problematic areas in the Profiling API.

\section{Testing of Visual Profiler Access and UI}
Over seventy unit tests verify the functionality and the correctness of the Access and the UI. We chose the NUnit\footnote{\href{http://www.nunit.org/}{http://www.nunit.org/}} as a test framework and used it together with the Resharper test runner\footnote{\href{http://www.jetbrains.com/resharper/features/unit_testing.html}{http://www.jetbrains.com/resharper/features/unit\_testing.html}}. It was a great choice since it was a very productive environment. Resharper even allows debugging the unit tests.

The software architecture of the Access and the UI had to be adjusted to allow testing. We did not develop according test driven development paradigm, but we developed ``with tests in mind''. The decoupling of classes and introduction of interfaces helped us to ease the testing. To mock dependencies we used a mocking library Moq\footnote{ \href{http://code.google.com/p/moq/}{http://code.google.com/p/moq/}}. Moq is developed to support newest features in .NET and takes full advantage of features like expression trees to easily and type-safely mock behavior. 

The UI was covered by unit test only partially, mainly the model part. The biggest reason for that was lack of time in the end of the project. Nevertheless, the Model-View-ViewModel pattern employed in the UI can make testing of the UI logic almost a trivial task. 

The Visual Studio Package project template for creating extensibility offers an option of creating an integration test in the initial wizard. Since our extension is a small, proof-of-concept like project, we did use it. 

The final user interface was tested by hand. We focused on layout issues, correct functionality and clear logics of the UI.

The installation of the Visual Studio SDK adds an experimental instance of Visual Studio. What this means is that during an extension development, there is no need to risk corruption of the development environment by testing the developed extension in the very same development environment. You can just test it in an independent instance of Visual Studio - an experimental instance. The experimental instance has its own registry and data storage isolated from the main instance. This can even be reset back to what it was after the Visual Studio installation and thus roll back all changes made to the experimental instance. Extensions deployed to the experimental instance can be debugged from the main instance just by hitting F5 - what a great way to develop. 



\section{Evaluation of overhead}
The overhead imposed on a profiled application is one of the main characteristics of profilers. The overhead greatly depends on the profiling mode and granularity as discussed in the chapter \ref{01ProfModes}.

We conducted measurements of both, the tracing and the sampling mode, profilers on the Mandelbrot fractal rendering application and on a Fibonacci sequence application, created only for the purpose of the measurement. We chose the Fibonacci sequence because of its high recursion, which can very well demonstrate the difference between the overhead of both profiling modes. The program runs 100 iteration of Fibonacci of 30. The Mandelbrot application in this case represents a ``normally'' behaving application with loops and function calls. The results are presented in the table \ref{07tbl:compareResults}. 

\begin{table}
\centering
    \begin{tabular}{|l|r|r|r|}
        \hline
        ~                  & without profiling & tracing mode & sampling mode \\ \hline
        Fibonacci sequence & 1                 & 110,11       & 1,13          \\ 
        Mandelbrot fractal & 1                 & 1,77         & 1,09          \\
        \hline
    \end{tabular}
    \caption{The multiples of application duration for the tracing and the sampling modes. }
    \label{07tbl:compareResults}
\end{table}
 
As expected, the tracing mode copes poorly with extensive method calls, whereas it makes the application run about 80 \% slower in the normal-scenario. On the other hand the sampling mode imposes constant overhead around 10 \% regardless of method call intensity.

\section{Evaluation of exactness}
The exactness of profiling results depends on the profiling mode and granularity as discussed in the chapter \ref{01ProfModes}. The tracing mode results are very accurate and fully match runtime behavior of an application. Compared to that, the sampling mode results only show trends in the applications. Some method invocations are not even detected. To prove that we ran the Mandelbrot fractal application for both modes and counted the number of methods in the results. The table \ref{07tbl:compareResultsExactness} revels that the sampling mode lags behind the tracing mode with only around 70 \% of detected methods.

\begin{table}[h!]
\centering
    \begin{tabular}{|l|r|}
        \hline
        Tracing mode  & 32 methods \\ \hline
        Sampling mode & 23 methods \\
        \hline
    \end{tabular}
     \caption{The number of detected methods in a profiling session. }
    \label{07tbl:compareResultsExactness}
\end{table}

\section{Impact of assembly filtering on results}
The section \ref{subsubsec:04ProfEnabAssem} about profiling enabled assemblies discusses reasons for profiling only some assemblies. Leaving out some assemblies from profiling may, nonetheless, hide some important details about method calls and affect the profiling results. 

A problem may occur if a method \texttt{A} from a profiled assembly (profiled method) calls a method \texttt{B} from non-profiled assembly (non-profiled method) and the non-profiled method \texttt{B} then calls a profiled method \texttt{C}. The real call tree is A-B-C but the profiler records A-C, because the non-profiled method did not cause a transition in a call tree structure of the profiler. 

An interesting effect of this event can be observed in the Mandelbrot test application by measuring the active time (user plus kernel CPU time). The \texttt{ApplicationMessageLoop} method is called once and starts only a form (window). It has two lines of code. Interestingly, it ranks as the 3rd most active method. A full, non-filtered profiler run proved our hypothesis about the application message loop acting as the non-profiled method B from previous paragraph. The application actually handles mouse-move events to display coordinates. The profiled \texttt{ApplicationMessageLoop} method starts the non-profiled message loop and the non-profiled message loop invokes the profiled mouse-move handler each time the mouse moves. When the handler returns, it updates time counter of the \texttt{ApplicationMessageLoop} method, instead of the counter of the message loop, because the profiler does not have a clue about the message loop methods and its counter.

\section{Summary}
The different approaches to testing and ensuring the application correctness, such as unit testing and pure hand testing, were introduced in this chapter. We also pointed out our testing imperfections and attributed them to the lack of time.

After that, we revealed the non-surprising result of the comparison of the profiling modes. The sampling mode is clearly faster and not influenced by structure of an application, whereas the tracing profiler provides more precise data.

Lastly, we looked at an interesting variation of results caused by the assembly filtering.

